The RODRIGO Database

Authors

  • Nicolás Serrano
  • Francisco Castro
  • Alfons Juan-Císcar

Abstract

Annotation of digitized pages from historical document collections is very important to research on the automatic extraction of text blocks and lines, and on handwriting recognition. We have recently introduced a new handwritten text database, GERMANA, which is based on a Spanish manuscript from 1891. To our knowledge, GERMANA is the first publicly available database mostly written in Spanish and comparable in size to standard databases. In this paper, we present another handwritten text database, RODRIGO, written entirely in Spanish and comparable in size to GERMANA. However, RODRIGO comes from a much older manuscript, from 1545, in which the typical difficulties of historical documents are more evident. In particular, the writing style, which has clear Gothic influences, is significantly more complex than that of GERMANA. We also provide baseline handwriting recognition results for reference in future studies, using standard techniques and tools for preprocessing, feature extraction, HMM-based image modelling, and language modelling.

Publication date: 2010